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ABSTRACT 

The Kepler Mission has discovered thousands of planets with radii < 4 R0, paving the way for the 
first statistical studies of the dynamics, formation, and evolution of these sub-Neptunes and super- 
Earths. Planetary masses are an important physical property for these studies, and yet the vast 
majority of Kepler planet candidates do not have theirs measured. A key concern is therefore how 
to map the measured radii to mass estimates in this Earth-to-Neptune size range where there are no 
Solar System analogs. Previous works have derived deterministic, one-to-one relationships between 
radius and mass. However, if these planets span a range of compositions as expected, then an intrinsic 
scatter about this relationship must exist in the population. Here we present the first probabilistic 
mass-radius relationship (M-R relation) evaluated within a Bayesian framework, which both quantifies 
this intrinsic dispersion and the uncertainties on the M-R relation parameters. We analyze how the 
results depend on the radius range of the sample, and on how the masses were measured. Assuming 
that the M-R relation can be described as a power law with a dispersion that is constant and normally 
distributed, we find that M/M^ = 2.7{R/R^y-^, a scatter in mass of I.QMq, and a mass constraint 
to physically plausible densities, is the “best-fit” probabilistic M-R relation for the sample of RV- 
measured transiting sub-Neptunes {Rpi < 4 R0). More broadly, this work provides a framework for 
further analyses of the M-R relation and its probable dependencies on period and stellar properties. 
Keywords: planets and satellites: composition — methods: statistical 


1. INTRODUCTION 


The Kepler Mission has found thousands of planetary 
candi dates with sizes between that of Earth and Nep- 


tune (IMullallv et al. 120151: 

Rowe et al.ll2015 

:IBurke et alJ 

[20I4I: [Batalha et al. [2013 

: [Borucki et al.l 

imiL The 


emergence of this population poses fundamental ques¬ 
tions about the typical compositional constituents of 
planets within a few times Earth’s size. As bulk densi¬ 
ties offer some insight into this problem, mass and radius 
measurements of individual planets have provided ob¬ 
servational constraints for theoretical co mposition stud¬ 
ies p e rformed on a per-plane t basis (e.g. iValencia et al.l 
l20ICt [Rogers fc Seaeerl boiOt iLopez et alT 201 Jb Re- 
cently these studies have shifted to conside ring the avail¬ 
able planets as a statisti cal ensemble (e.g. iRogersI[20151 : 
[Wolfgang fc Lop^ 120151 sans mass constraints), which 
motivates detailed analyses of the observed mass-radius 
distribution. 

The joint planetary mass-radius distribution, which is 
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often couched in terms of the mass-radius “relationship” 
(M-R relation), is also highly relevant for dynamical and 
formation studies of the Kepler planet candidates (PCs). 
Mass measurements for individual PCs are often unavail¬ 
able, as t he majority orbit st ars too faint for Doppler 
follow-up (|Bataiha et al.l l2?Hfll l and only ^ 6% exhibit 
transit timing variations (TTVs) at high signal-to-noise 
ratios ([Mazeh et al.l[201^ . Therefore, a statistical “con¬ 
version” is necessary to map observed radii to the masses 
these studies need. 

To date, several M-R relations have been posed in 
the exoplanet lit erature. To s o lve th e practical issue 
described above, [Lissauer et al.[ ([20III! fit a power law 
to Earth and Saturn and f ound M = wher e M 

and R are in Earth units. [Wn fc LithwickI (1201,111 de¬ 
rived masses using the amplitudes of sinusoidal TTVs 
for 22 planet pairs, an d found M = 3i?. More recently, 
[Weiss fc Marcv[ (|2014il . hitherto WM14, fit a power law 
to masses and radii available in the literature, which 
was dominated by the 42 planets chosen by the Kepler 
team to be followed up with radial velocity measurements 
(|Marcv et al.[[20l3l : they found M = 2.69i?° ®^ for plan¬ 
ets with 1.5 < i? < 4 R0. 

All of these results were produced via basic least 
squares regression, which is commonly used in astronomy 
to fit lines through points. However, this classic tech¬ 
nique does not properly account for several issues that 
are relevant to the small-planet M-R relation: measure¬ 
ment uncertainty in the independent variable (i.e. planet 
radii), non-detections and upper limits, and intrinsic, as¬ 
trophysical scatter in the dependent variable (i.e. planet 
masses). Thankfully, there are solutions to these prob¬ 
lems in both the Bayesi a n and frequentist statistics lit¬ 
erature (see §1 of [Kell-d (|2007f) for a concise overview). 
We present an example of one of these techniques which 
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can be executed using existing numerical algorithms and 
code O, which is effe ctively a simplified implementa¬ 
tion of the iKellvl (I2007D linear regression scheme. 

Of particular interest is the intrinsic scatter that has 
not been previously characterized. Theoretical work 
on planet compositions suggest this scatter should ex¬ 
ist: thermally evolved rock-hydrogen sub-Neptune in¬ 
ternal structu re models yield radii mostly indepen¬ 
dent of mass (|LoDez k FortnevI 1201411 . which produces 
significant mass-radius scatter when a distribution of 
gaseous mass fractions is present in the population 
(|Wolfgang k Lod^ 120151) . Furthermore, the diversity 
of choices for exoplanets’ internal structures produces 
a range of radii at a given mass due only to differ¬ 
ences in the layers’ compositions (e.g. iSeager et al.ll2007t 
iFortnev et al.lfeoOTl : [Rogers et al.ll201llb 

These theoretical findings motivate us to move beyond 
deterministic, one-to-one mappings, which are in a sense 
“mean” relationships. This average behavior is insuffi¬ 
cient and inappropriate if one’s aim is to argue for a par¬ 
ticular physical process based on full distributions of pa¬ 
rameters (versus qualitative comparison to observations), 
or if the purpose is to rule out parts of parameter space, 
which requires knowledge of the full mass-radius distri¬ 
bution. 

Realizing the need to move beyond deterministic 
mass-radius relations fo r their own theoretical work, 
iChatteriee k TaiJ (j201,5f) derived a piecewise probabilis¬ 
tic M-R relation by htting the density distribution of 
planets in four mass bins, and then fit a continuous, yet 
still deterministic, relation to those results. However, 
they stop short of computing a relation which is both 
continuous and probabilistic (which they admit would be 
ideal), and do not incorporate measurement error, which 
is significant for small planets. With the hierarchical 
Bayesian modeling that we employ here, we do both. In 
the process, we also more fully characterize the uncer¬ 
tainty in the M-R relation based on the current data. 
The effort to understand this uncertainty is important, 
as quantifying how well constrained the M-R relation pa¬ 
rameters are will be a key metric by which we measure 
the improvement in our understanding of the M-R distri¬ 
bution, especially as TESS and its follow-up observations 
produce more individual mass and radius measurements. 

In this paper we show how a probabilistic M-R rela¬ 
tion can be constructed ((J 2 ]) and constrained ((Q using 
any subset of planetary masses and radii ((J3]). We also 
highlight the observational evidence for this expected in¬ 
trinsic scatter and quantify it in a statistically robust 
way that includes uncertainties on the M-R relation pa¬ 
rameters ((jni). We discuss the correct usage and some 
major implications of these findings in 

2 . MODELING THE M-R RELATION 

Power laws are often used to parameterize the M-R re¬ 
lation because they are conceptually and computation¬ 
ally simple and can be easily fit to data using the familiar 
tool of linear regression. We continue with this choice 
to facilitate more direct comparisons with previous work 
and to illustrate how a hierarchical framework enables 
straightforward extensions to entire families of M-R rela¬ 
tions. In addition, we cast this in terms of M{R) instead 
of R{M) to address the practical problem of estimating 
masses from Kepler radii. 



Figure 1. Graphical model used to find the best-fit parameters 
for the probabilistic mass-radius relationship in Eqn[2l These pa¬ 
rameters of interest are yellow while the observed data are gray (see 
^ and unobserved parameters are white; definitions are below. In 
practice, we summarize each planet’s “RV vs. time” dataset as the 
mass measurement and the uncertainty in that measurement, 

(i) 

^Mo6’ similarly, the “Flux vs. time” dataset is summarized as the 
radius measurement and its uncertainty contains 

further discussion of this choice. Our full hierarchical model, which 
includes the details of the probability distributions from which each 
parameter is drawn, is displayed in Equation [4| 

a = population-wide radius distribution parameters 
C = constant in mean M-R relation 
'y = power law index of mean M-R relation 
(JM = intrinsic dispersion in planet masses at a given radius 
(i) 

RJ ^ = true radius of the ith planet 
(i) 

= observed radius of the ith planet 

^^Rob ~ measurement uncertainty in 
(i) 

^ = true mass of the ith planet 
(i) 

= observed mass of the ith planet 

ii) (i) 

cTjJoh ~ measurement uncertainty in 


In particular, we consider three power law-based M-R 
relations (Eqns [TUS]) . The first is the form used by most 
prior studies (see m- 


M R W 

Me \R^J 


( 1 ) 


where M is the mass of the planet, R is the planetary 
radius, and C and 7 are the parameters to be fit to the 
data. This relation is deterministic in the sense that only 
one mass is allowed for a given radius. 

If instead we want to allow for a range — that is, if 
we want to incorporate the expected intrinsic scatter — 
then we need to create an M-R relation which specifies 
how those masses should be distributed at a given in¬ 
put radius. Again, taking the most simple, familiar, and 
analytically tractable approach, we choose a Gaussian 
distribution, where the mean population mass p, is given 
by the above power-law relation and where the standard 
deviation ctm (units of Mq) parameterizes the intrinsic 
scatter in planet masses: 


T\/T / / t? \ '*y \ 

— ~Normal(M = c(—) .c = n) (2) 

Note that ~ means “drawn from the distribution”, 
thereby marking the difference between a deterministic 
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and a probabilistic M-R relation. Figure [T] is the graphi¬ 
cal model corresponding to Eqn[^ and includes Gaussian 
error bars on the measured masses and radii (see (|4] for 
all details of the model). 

Generalizing further, the width of the intrinsic scatter 
may change as planets increase in size, so we consider a 
probabilistic M-R relation that allows the standard de¬ 
viation itself to vary as a function of radius via the slope 
(3 (units of M^R^^): 

^^Normal(M = c(A)V= (3) 

where R = R/R<^ — 1 and ctmi is now the standard de¬ 
viation in planet masses at 1 R 0 (i? = 0 ). 

3. DATA 

With the statistical M-R relations defined, we turn to 
the problem of identifying which observational dataset 
to use. Optimally we would use a subset of mass and 
radius measurements that is uniform and complete, as 
any systematic biases present in the sample will mani¬ 
fest as biased M-R parameter values. Unfortunately, the 
available masses and radii are far from this ideal, with 
mass measurements made with two fundamentally dif¬ 
ferent methods by many different pipelines and chosen 
for follow-up by a complex, poorly documented selection 
function. There is significant work to be done to un¬ 
derstand how these systematics affect the M-R relation, 
but it is outside the scope of this paper, as our main 
purpose is to show how a probabilistic M-R relation can 
be derived from whichever dataset one wishes to use. 
Therefore, we choose a baseline dataset consisting of ra¬ 
dial velocity-measured masses, which somewhat reduces 
the heterogeneity of the sample while preserving a fairly 
large number of data points. 

Table [2] shows all of the masses and radii that we con¬ 
sider, with our baseline dataset denoted with a label of 
0; the list was constructed by starting with the WM14 
dataset and identifying new planets and updates in the 
NASA Exoplanet Archive (last accessed 1/30/2015). We 
manually double-checked each planet to verify that the 
reported measurements were correct and most up-to- 
date, paying particular attention to which methods and 
stellar parameters were used (data denoted by a label 
of 1 were present in and haven’t changed since WM14). 
Given the above concerns with dataset heterogeneity, 
when both TTV and RV masses are independently avail¬ 
able for a single planet, we choose the RV-measured 
mass es. In practice, only Kepler-18b (iGochran et al.l 
120111 ) provide strong enough mass constraints from both 
methods to require a choice to be made, and even then 
the two mass measurements are consistent. The TTV 
dataset (label of 2) contains only the sub-Neptune-sized 
planets that have had their transit timing variations fit 
with N-body integrations, as these masses are the best 
constrained and therefore provide the most information 
for the sub-Neptune M-R relation; neither circumbinary 
planets nor unconfirmed planets were included, again to 
try to keep a somewhat more homogeneous dataset. Fi¬ 
nally, to enable easier comparison with previous work, 
we continued the error treatment of WM14: if asymmet¬ 
ric upper and lower uncertainties were reported, we used 


their average as a symmetric Icr error bai0. 2a upper 
limits were included if they were < 80 M® for i? < 4 R® 
and < 300 Mq for 4 < i? < 8 R®. 

4. FITTING THE M-R RELATIONS 

We use hierarchical Bayesian modeling (HBM) to 
ht the M-R relations in to the data described in 
m_ This statistic al met hod is described in detail in 
iWolfgang fc LoDe'3 (|2015D in the context of exoplanet 
compositions; further pedagogical discussion and exam- 
ples of HBM i n the astronomical literature is provided by 
iLoredol (|2013ll . A very similar approach to this HBM - 
enabled linear regression was detailed in iKellvl (1200711 ; 
we refer the reader to that paper for an in-depth discus¬ 
sion of the general advantages and improvements of this 
approach over the commonly used analysis for linear 
regression. 

For the problem at hand, HBM (or the analogous fre- 
quentist methods for multi-level modeling) is necessary 
for a number of reasons: 

• It allows us to directly model and fit the astro- 
physical dispersion in the population as an explicit 
parameter. 

• It allows us to self-consistently incorporate uncer¬ 
tainties on the independent variable (radii in this 
case), without the need for elaborate bootstrapping 
schemes. 

• Most sub-Neptune mass uncertainties are large, 
and some are realistically only upper limits. HBM 
is able to simultaneously use all likelihood distri¬ 
butions no matter their width or shape, which in¬ 
creases the information content of the resulting M- 
R relation and decreases the biases that binning or 
weighting schemes introduce when these likelihoods 
are asymmetric. 

• Relatedly, HBM allows us to introduce the true 
masses and radii as latent (unobserved) parame¬ 
ters; this enables us to restrict the masses to phys¬ 
ically allowed parameter space (such as M > 0 or 
P < PironiM)) while preserving all of the informa¬ 
tion in the observations, including when the “best- 
fit” masses happen to be negative. 

• As with all Bayesian methods, HBM produces pos¬ 
terior distributions, allowing us to easily see the un¬ 
certainties in the M-R relation parameters. Most 
of the M-R relations currently reported and used 
in the literature have no published uncertainties. 

The hierarchical model for our baseline M-R relation 
(Eqn[2]) is displayed in Figure [T] to clarify the structural 
relationships between parameters and observables. This 
structure is also present in the written version below, 
along with details of the distributions we used (“N” rep¬ 
resents a normal distribution with the listed parameters 
in order of p, and cr; “U” represents a uniform distribu¬ 
tion with the listed numbers bounding the interval; and 

® Future work using HBM can improve on this error treatment 
by using the full posteriors of the mass and radius measure men ts, 
if these posteriors are made available in the literature (see lOll . 
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Deterministic M-R Relation (Eqn 1) 

Probabilistic M-R Relation with Constant Scatter (Eqn 2) 
Probabilistic M-R Relation with Non-Constant Scatter (Eqn 3) 



1 2 3 4 5 6 0.5 1.0 1.5 2.0 2.5 1 2 3 4 5 6 

Constant C (MEarth) Power-law Index y Scatter at Earth’s size (MEarth) 


Figure 2. Posteriors for the parameters in our family of M-R relations (row 1: Equations msi row 2: Equations lasl row 3 and gray 
posterior samples in all panels, which contain all 10,000 saved samples from our thinned MCMC chains: EquationlSj when fit to our baseline 
dataset. 68% and 95% contours are shown for each, and demarcate the uncertainties on these M-R relation parameters; the triangles denote 
best-fit values. Panels b and c show that = 0 is strongly excluded for R < A R0, and so astrophysical scatter exists in the sub-Neptune 
M-R relation. Therefore, theoretical studies which require an M-R relation should use a probabilistic one like that of Eqn[2]with one of the 
sets of parameter values in Tabled 


“I” means “given”, i.e. the parameter to the left depends 
on the parameters to the right): 


7 

ln(C) 


log(cr^ 


M) 


(d 


R. 

n(d|n(d (d 


N(l,l) 

U(-3,3) 

U(-4,2) 

U(ai = 0.1,02 = 10) 
7 ln(i?«)+ln(C) 
N(e^« ,tTM) 

N(M«,4U) (4) 


For the deterministic M-R relation of Eqn [U Eqn 2] re¬ 
mains the same except there is no ctm parameter, and 


M«|i?«,C',7 = e'^M 

while for the M-R relation of Eqn [3l there was an addi¬ 


tional parameter /3 such that: 

/3 ~U(-10,10) 

^ c, 7, cTMi, /3 N(e'^M , 

Note that the normal distributions in the last two lines 
of the model (collectively Eqn[4]) are the same likelihoods 
that are assumed when using to perform linear regres¬ 
sion. 

For all M-R relations we consider, we truncated the 
distribution such that 0 < where 

^tpureFe Computed using the 0% rock mass frac¬ 
tion analytic fits to the iFortnev et al.l (|2007f) rock-iron 
internal structure models: 




—h + \Jh'^ — 4a(c — 
2a 


(5) 


where a = 0.0975 , b= 0.4938, and c=0.7932 

(|Fortnev et al.ll^OTbll . This truncation was imposed to 




















































Scatter in the Sub-Neptune Mass-Radius Relationship 


5 


restrict the planet masses to physically plausible densi¬ 
ties given the planet’s true radius. 

We also performed a prior sensitivity analysis on our 
population-wide parameters to assess the degree to which 
our results depend on our priors. As given in EqnUl we 
used a wide normal distribution for the power law in¬ 
dex; we chose this prior to be wide with a variance of 1 
to minimize the amount of information in the posterior 
that comes from the prior, and we chose a mean of 1 be¬ 
cause we expected a priori that near-linear relationships 
were physically plausible. This choice was admittedly 
arbitrary, so we tested the sensitivity of our results to 
these assumptions. To do this, we ran the MCMC again, 
but with two other priors: a uniform distribution for 7 
and a uniform distribution for s where 7 = tan(s), which 
corresponds to a uniform distribution in the slope of the 
power lavEB- Under each prior, we find that the posterior 
modes (the “best fits”) for (7,7, and ctm differed by no 
more than 0.05 from the best-fits listed in Table[Tl which 
is below our reported precision. Additionally, we tested 
several end member cases for the Rt distribution, and 
the choice for this prior had a similarly negligible effect 
on the result, primarily because Rot is fairly well con¬ 
strained throughout the sample. Therefore, we conclude 
that our results are robust to our choice of priors. 

To produce the results shown in m we evaluate 
each model wit h JAGS (Just Another Gibbs Sampler; 
iPlummeil 1200311 . an R code for numerically evaluating 
hierarchical Bayesian models with MCMC. For each set 
of posteriors in Figures [2] and [H we ran 10 chains con¬ 
sisting of 500,000 iterations each. The first half of each 
chain is discarded as “burn-in”, and the resulting half is 
thinned by a factor of 250, such that we retain 10,000 
posteriors samples of each parameter. 

To assess the independence of these samples, we com¬ 
pute the effective sample size (FSS), which accounts for 
the autocorrelation still present within these thinned 
Markov chains (ESS = 10000 indicates perfect indepen¬ 
dence). The ESS is > 4000 for each parameter listed 
in Table [ll with two exceptions. The ESS of the deter¬ 
ministic relation parameters are around 230, an order of 
magnitude lower than all the probabilistic relations we 
tested. The difficulty this model had with convergence 
reflects the challenges of applying a physically inappro¬ 
priate model to data and is another indication that a 
deterministic relation does not fit the observed masses 
and radii well. Less concerning yet not quite as well con¬ 
verged compared to the others were the parameters for 
the probabilistic M-R relation fit only to the smallest 
radii (ESS = 1500 — 4000). This occurs because these 
small planets have the largest mass uncertainties; this 
causes the maximum mass restriction in Eqn[5]to severely 
truncate most of the likelihoods, which results in high 
autocorrelation in the MGMG chains. Given the small 
ESS for these two sets of parameters, we caution against 
over-interpretation of their results: their “best-fit” val¬ 
ues in Table [ 1 ] are ^ 6 and ~ 2 times more uncertain than 
the others (corresponding to the precision of the poste¬ 
rior mean with the square root of the sample size [the 
ESS]), and the boundaries of their 95% credible regions 
in Figure [3] are poorly estimated. We do not spend time 

Uniform 7 places high probability at steep power laws, which 
are highly unlikely on physical grounds. 


Table 1 

Best-Fit Parameters of the M-R Relation 


Equation 

Dataset 

c 

7 

<^M 


1 

baseline: RV only, < 4 R 0 

2.1 

1.5 

— 

— 

2 

baseline: RV only, < 4 R^ 

2.7 

1.3 

1.9 

— 

2 

N-body TTVs only, < 4 Rq 

0.6 

1.7 

1.7 

— 

2 

Weiss (< 4 R 0 ) 

2.8 

0.9 

2.5 

— 

2 

RV only, <1.6 R® 

1.4 

2.3 

0.0 

— 

2 

RV only, < 8 R 0 

1.6 

1.8 

2.9 

— 

3 

baseline: RV only, < 4 R 0 

2.6 

1.3 

2.1 

1.5 


Note. — These “best fit” values correspond to the mode of 
the joint posterior distributions; see code and posterior samples in 
the github repository dawolfgang/MRrelation to account for the 
full u ncer tainty in the parameters that is contained the posteriors 
(see E3] for more details on this). Also, when using these M-R 
relations to generate masses from planet radii, one should apply 
the density constraint given by Eqn[5] 

running these simulations longer, as they were performed 
for the sake of comparison, and we display them for this 
purpose only. 

For the main result — the baseline dataset evaluated 
with the probabilistic M-R relation — the ESS of C 
and 7 are 10,000, and the ESS of ctm is 5,300. Fur- 
th ermore, the between-ch ain convergence diagnostic R 
of iGelman fc RubinI (|1992f) is < 1.002 for all parameters 
in our probabilistic models (except again for the small¬ 
est radii planets, for which R = 1.008 at its worst). 
Together, these two tests provide no evidence that the 
posteriors have not converged, and we proceed with the 
usual amount of confidence (given that no one can ever 
prove convergence). 

5. RESULTS 

Table □ shows the results of our modeling: it displays 
the best-fit parameters for each of the various M-R rela¬ 
tions and datasets that we consider. The first entry cor¬ 
responds to the deterministic M-R relation; entries 2 — 6 
correspond to our probabilistic M-R relation for various 
datasets (see [JS] 1 15.2p : and the last entry corresponds to 
the probabilistic M-R relation with non-constant scatter. 
In particular, the second entry lists the best-fit values 
for Eqn[3]using our baseline dataset. All were computed 
with the density restriction given by Eqn [S] this con¬ 
straint should also be applied to the masses generated 
from these M-R relations when these relations are used 
in forward modeling. 

In all cases the reported “best fit” values correspond 
to the mode of the joint posterior distribution, and are 
denoted by the triangles in Figures [211S1 The uncertain¬ 
ties in the parameters are represented by the displayed 
68 % and 95% posterior contours, with the contours corre¬ 
sponding to our baseline dataset colored blue. The gray 
points are the 10,000 saved posterior samples from our 
thinned MCMC chains using the baseline dataset. 

5.1. Deterministic vs. Probabilistic M-R Relations 

The primary motivation for this paper was to assess 
the observational evidence for intrinsic scatter in the sub- 
Neptune M-R relation, and to characterize this scatter if 
warranted. To do so, we compare the posteriors for our 
three M-R relations in Figure [2| (note that not all rela¬ 
tions have all parameters: for example, the deterministic 
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Figure 3. Posteriors for Eqn[2ls M-R relation parameters when we change the input dataset ( 68 % and 95% contours shown; triangles are 
best-fit values). The blue contours represent the baseline dataset and are the same as those in panels a and b of Figure [2] the gray points 
are the 10,000 saved posterior samples from our MCMC chains using this baseline dataset. The green TTV M-R relation is systematically 
shifted downward (lower C) compared to the baseline M-R relation, while the red WM14 dataset, a hybrid of the two, produces a posterior 
which falls between them (the black point is the WM14 result itself). When we consider different radius ranges, we see that Robs < 8 
R 0 (cyan) produces a slightly down-shifted, steeper, and more dispersed M-R relation than the default Robs < 4 R 0 (lower C and higher 
7 , (TM? although the posteriors do overlap), while the M-R relation for Robs <1-6 R 0 (orange) is not well constrained (although cjm ~ 0 
for reasonable values of C). 


M-R relation of Eqn [T] is described only by C and 7 , 
so it only appears in panel a). Panels b and c show 
that this intrinsic scatter exists: because the posteriors 
lie away from zero, um = 0 is strongly excluded by the 
data, even with the currently large individual mass error 
bars. This is not a result of our choice of priors: the 
parameterization in Eqn|4]is equivalent to ~ I/o’mi 
which is strongly weighted toward zero, in contrast to 
the posterior we compute. 

Comparing the different M-R relations, we see that 
the C, 7 posterior for the model given by Eqn [T] is much 
tighter than that for Eqns [2][3l This is expected: when 
we keep the dataset fixed but add more parameters, es¬ 
pecially one like ctm that by construction allows wiggle 
room around a deterministic relation, the observational 
information content per parameter decreases, and the 
posteriors widen. Given this expectation, what is ar¬ 
guably more notable are the small differences between 
Eqn [2] and [Us model posteriors for the parameters they 
have in common: most of the extra width of Eqn |3ls 
joint posterior is contained in the new parameter /3 (Fig¬ 
ure O panels d-f), which spans zero. There is therefore 
not enough evidence in the current dataset to justify an 
intrinsic scatter that changes as a function of radius, at 


least not under our model assumption^HI. For the best- 
fit values of these parameters, which correspond to the 
triangles in Figured see Tabled] 

5.2. Changing the Dataset 

The results in US. H are for our baseline dataset, an RV- 
only sample with Robs < 4 R 0 . However, all Bayesian 
results depend on the data that are used, so it is impor¬ 
tant to carefully consider what the dataset contains. To 
demonstrate this, we present some illustrative examples 
of the M-R relation posteriors under different mass and 
radius selection functions (Figure |3|). 

The left side of Figure |3| displays results for sam¬ 
ples of planets that have had their masses measured 
in different ways. A n umber of prior studies (e.g. 
I.Tontof-Hutter et al.ll20l4 WM14) have noted that plan¬ 
ets with masses measured from their high SNR TTVs 
tend to be systematically less dense than planets with 

While outside the scope of this paper, future analyses of the 
M-R relation can address this and other questions of model se¬ 
lection more quantitatively by computing posterior Bayes factors. 
Regardless, the results for the statistical models represented by 
Eqns [ 1 ] and [3] can serve as a sensitivity test for that of Eqn [2] as 
we describe. 
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Best-Fit M-R Relations 


Individual Mass and Radius Posteriors 



0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 

Radius (REarth) Radius (REarth) 


Figure 4. Left: the best-fit M-R relations from the right column of Figure [3] For each, the solid line denotes the mean relation fiM 
while the faded region denotes the standard deviation of the intrinsic scatter (vertical height of region to either side of line = ctm', note 
= 0 for the smallest planets). The M-R relation of WM14 is the dashed black line while the baseline dataset is overplotted as the 
thin black lines with triangles for the 2-cr upper limits^ote that WM14 was calculated with a dataset that includes TTV planets). Right: 
the baseline M-R relation (the second entry in Tabled marginalized over the corresponding posterior distribution and subjected to our 
physical mass range restriction. The blue region now corresponds to the central 68% of planet masses that were drawn at a given radius. 
The 68% coverage interval of the posterior true masses and radii of individual planets are plotted red; the same and as on the left 
are plotted in gray for comparison. 


RV-measured inasset0. The origin of this difference is 
still unclear, and is not something our current modeling 
is able to address (see 1 16.3I) : it could be due to either an 
intrinsic difference in the densities of these two popula¬ 
tions, or to observational bias, as TTVs for larger (and 
thus less dense) planets are easier to detect while RVs for 
more massive (and thus more dense) planets are easier 
to measure. In any case, our results confirm the exis¬ 
tence and nature of this discrepancy, if not the reason: 
the green TTV-only posterior is shifted towards lower 
C with similar 7 and (tm, which produces on average 
lower masses for a given radius. Furthermore, the hy¬ 
brid WM14 dataset yields the red posterior, which falls 
between the TTV-only and RV-only posteriors yet peaks 
at lower 7 , illustrating that posterior modes (Bayesian 
“best fits”) for joint datasets are not necessarily aver¬ 
ages of the modes for separate subsets. This behavior 
can be understood when one considers that these TTV- 
measured planets are preferentially larger than the RV- 
measured planets: this pulls the joint M-R relation down 

This appears to be a population-level effect: there are few 
planets with independently analyzed, strongly constrained TTV 
and RV masses, and they all yield measureme nts that are consisten t 
between the two methods [Kepler -18b &c c IlCochran ejTalJ jgOIT]!. 
Keple r-88c IINesvornv et al.l 120131 . and Kepler-ll? lIBruno et alj 
1201511 : note that only Kepler-18b appears in our table]. There are 
other systems where the two methods have been used in concert 
to infer planet masses (e.g., Kepler-9, Kepler-10 Kepler-11, Kepler- 
89; see table for references), but the constraints from one or both 
methods are relatively weak, making them insensitive tests of a 
TTV-RV difference on a per-planet basis. 


at higher radii because the TTV-measured planets there 
have lower masses (which lowers 7 ) but affects the rela¬ 
tion at lower radii very little because there are few small 
planets in our TTV-measured sample (which keeps C 
roughly the same). 

The right side of Figure [3] displays results for sam¬ 
ples of planets spanning different radius ranges, illus¬ 
trating the effect that a somewhat arbitrary radius cut 
can have on one’s results. Compared to our default 
sub-Neptune range, a Rots < 8 Re cut produces an 
M-R relation that is overall shifted down, is steeper, 
and has more intrinsic scatter (the cyan posterior has 
lower C and h i gher 7 , ctm). This is consistent with the 
iLissauer et al.l (|201lD fit to Earth and Saturn over a sim¬ 
ilar radius range, although neither of these Solar System 
planets were included in our dataset. Meanwhile, the M- 
R relation is poorly constrained for the Robs <1-6 Rq 
sample, the radius r ange outside of which rocky planets 
likely do not occur (|R.ogersl [20151 ). This is because our 

0 < restriction is most severe for these 

small planets, allowing only a small range of physically 
plausible masses. This range is completely spanned by 
most of the mass measurements (see right side of Figure 
a, so there is little empirical extrasolar information for 
Robs < 1-2 R 0 , and the orange posteriors are dominated 
by the few larger planets with well measured masses. 
With this sample, there is not currently enough obser¬ 
vational evidence in this radius range to rule out a de¬ 
terministic relation. This does not mean, however, that 
these small planets all have the same composition, as the 
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posterior 68 % contour spans power-law indexes between 
1 and 3 and a constant population-wide Eartti-like rocky 
composition would h ave a power-law index of around 3.7 
(jValencia et al.l[20?i^ . More data will be needed to yield 
a better estimate of the power-law index, and therefore 
of the compositional diversity of small planets. 

5.3. Visualizing the M-R Relation 

While the posterior contours in Figures [US] show the 
best-fit M-R relation parameters and their uncertainties, 
visualizing the M-R relation itself requires that they be 
mapped from parameter space to mass, radius space. 
There are at least two ways to do this with Bayesian 
analysis, and they are displayed in Figure ID 

First, one can simply take the best-fit values and plot 
the resulting relation, as was done in the left panel. Here 
the 1(7 width of the probabilistic relation, as parameter¬ 
ized by (Jm 1 is denoted by the faded colored region while 
the mean relation, as parameterized by C and 7 , is the 
thick line of the same color. Note that the mean M-R 
relations extend into unphysical regimes for R < 1 R-©; 
this is because the mass observ ation s span the physically 
allowed region, as discussed in >15.21 leaving the M-R re¬ 
lation to be constrained primarily by the locations of the 
larger, higher mass planets and our model assumptions. 
The presence of intrinsic scatter in our M-R relation nev¬ 
ertheless allows physically realistic masses to be assigned 
to the smallest planets; to force this requirement, we rec¬ 
ommend adding a density constraint to Eqn[2]such that 
the probability of a planet being drawn outside this range 
is 0 (the constraint we used is given in Eqn[5]), or to use a 
different M-R relation for sub-Earth-sized planets. The 
different colors in the left panel correspond to the M-R re¬ 
lations in the right column of Eigure[3l these mostly over¬ 
lap in the sub-Neptune regime. Note that the RV-only 
dataset produces a steeper relation than one which also 
contains high SNR TTV planets (i.e. the black dashed 
WM14 relation), as discussed in >15.21 

While these best-fit M-R relations are easy to use, they 
do not take into account the fact that the posteriors have 
non-zero width and therefore a range of M-R relation pa¬ 
rameters are allowed by any one dataset. A more thor¬ 
ough implementation of these results would account for 
these uncertainties by integrating over all of the posterior 
samples. This marginalization, which also incorporates 
the physical restrictions on Mt as given by Eqn[5l is dis¬ 
played on the right: now the blue region corresponds to 
the central 68 % of planet masses that were drawn for a 
given radius. Note that this region is wider than that 
on the left and that the masses no longer extend into 
unphysical regimes. The 68 % coverage interval of the 
posterior true masses and radii of individual planets in 
the baseline sample are plotted red, while the same Rob 
and Mob as on the left are plotted in gray. As expected 
(see the end of > 16.21) . the posteriors have “shrunk” toward 
the mean relation within the uncertainties provided by 
the data. Eurthermore, one can readily see that the data 
are qualitatively consistent with the modeled M-R rela¬ 
tion: the red lines fall within and immediately around 
the blue region (see > 16.21 for a quantitative treatment of 
model checking). 

6 . DISCUSSION 


6.1. Using the M-R Relation to Predict Masses 

The most straightforward and computationally simple 
way to map a sub-Neptune’s radius to a mass while ac¬ 
counting for intrinsic scatter is to adopt Eqn[2]with one 
of the sets of parameters in Table [T] and impose a den¬ 
sity constraint like Eqn[5]for the smallest planets. This 
best-fit M-R relation is analytic and represents a sub¬ 
stantial improvement over the previous deterministic re¬ 
lationships in capturing the full mass-radius distribution. 
However, it does not incorporate uncertainties in the M- 
R relation parameters or uncertainties in the measured 
planet radius itself. Depending on how detailed one’s 
analysis needs to be, a more accurate predictive mass 
distribution may be needed. 

To account for these issues, one must compute the pos¬ 
terior predictive M-R relation, which marginalizes over 
both the posteriors displayed here and the radius poste¬ 
rior produced by one’s light curve modeling. This mass 
distribution will be wider than that produced by simply 
applying Eqn [2] (see right side of Figure a because it 
incorporates the above sources of uncertainty and thus 
more accurately reflects our state of knowledge about 
these planets’ masses. Kepler-452 b (|Jenkins et al.ll2015[l 
provides an example of an individual planet’s posterior 
predictive mass distribution that has been calculated 
with this probabilistic M-R relation; because its com¬ 
putation requires the numerical posterior samples that 
we have produced, the resulting mass distribution is also 
numerical in nature. To enable more calculations like 
this one, we have posted our posterior samples in the 
github repository dawolfgang/MRrelation along with R 
code that uses them to compute and plot the posterior 
predictive mass distribution for individual planets. 

6.2. Model Checking 

The purpose of the right panel in Figure 4 is to provide 
a qualitative comparison between the data and the base¬ 
line probabilistic M-R relation; this visual check imme¬ 
diately verifies that, in the broadest sense, our model is 
a reasonable description of the data (see > 15.31) . However, 
no model perfectly describes nature. A more in-depth 
look is warranted, to both understand the limitations of 
the current model and to identify areas for improvement 
in future work. 

Quantitatively, we can check the data-model con¬ 
sistency by computing a “hierarchical p-value”, which 
yields the fraction of all possible datasets which are more 
discrepant from the model than the observed dataset. 
This calculation necessarily involves sampling from our 
forward model (Eqn|4|) and using those samples to calcu¬ 
late a statistic which quantifies “discrepant”. The identi¬ 
fication of robust and useful hierarchical statistics is still 
an active area of statistical research (see, for example, 
iBavarri k Castellanosll200^ . as the multi-level nature of 
hierarchical modeling offers a number of choices that test 
different aspects of the model. As a result, model check¬ 
ing in practice can be an involved process that requires 
investigations into multiple parts of the problem. We 
provide two such investigations below to illustrate some 
of the subtlety of this endeavor. 

Regardless of the details, most forms of Bayesian model 
checking use the posterior predictive distribution. Con¬ 
ceptually, this distribution defines the probability of ob- 
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serving a certain data value given the currently observed 
dataset and the model. Mathematically, the posterior 
predictive distribution for a new observation Xnew is de¬ 
fined as: 


Pi^new\^T — jpt ^new |0, Mi)p{6\x, Mi)dO 


( 6 ) 


where p{xnew\0, Mi) is the likelihood of a new, currently 
unobserved data point given the parameters 6 of model 
Ml, and p{6\x,Mi) is the posterior of model Mi, i.e. 
the joint probability of all of Mi’s parameters given the 
observed dataset x. The integral denotes the process of 
marginalization over the parameters, so that the prob¬ 
ability of a new observation incorporates the model un¬ 
certainty allowed by the current observations. 

Next, one draws hypothetical data from this posterior 
predictive distribution until a dataset of the same size N 
as the observed dataset is achieved: 


^new ^ Pi^newl^^ Ml) (7) 

N 

where X^iew — ■•■:^new,N} and ~ 

means “draw from that same distribution N times”. This 
dataset is then used to compute the statistic of choice 
(see discussion below). Repeating this process thousands 
of times (i.e. bootstrapping the statistic) generates a dis¬ 
tribution of the statistic which can then be compared to 
the value that was calculated from the observed dataset. 
If the observed statistic falls within this distribution, the 
model is consistent with the data (see Figure 0. 

In practice, MCMC simulations (Q provide samples 
from the posterior rather than the analytic form of the 
posterior as required in Eqn [6l therefore, the posterior 
predictive distribution is not directly calculated. Instead, 
Eqns El and [7] are combined by using the posterior sam¬ 
ples: 

9 - p{e\x,Mi) 

to define the likelihood that one then draws from: 

^new Pi^ne'w\9T Ml) ( 8 ) 

Performing these two steps repeatedly produces an en¬ 
semble of datasets drawn from Eqn El Applying Eqn El 
to our M-R relation requires evaluating the lower levels 
of the forward model described in Equation [4l 
Part of the subtlety of checking data-model consistency 
arises because our model is hierarchical. In particular, 9 
of Eqn El includes both the population-wide parameters 
and the individual parameters M)- \R) ; we 
can use the posterior samples of either of these groups 
of parameters to calculate Xnew We show the result 
of both choices in Figure El using the posterior sam¬ 
ples from the baseline M-R relation (second line of Table 
[T]) . Conceptually, using samples from the individual true 

mass and radius posteriors (9 = evalu¬ 

ates the fit of the model to the currently observed set 
of planets (the green histograms on the left), while us¬ 
ing posterior samples for the M-R relation parameters 
[9 = {C, 7,crM}) evaluates the fit of the model to alto¬ 
gether new sets of planets (the blue histograms on the 

right). For the former case, the same and as 


the observed dataset are used; for the latter case, 
is drawn from the distribution of ffMob for the observed 

(■i) 

planets that have a similar mass, and is drawn from 
the observed dataset’s full distribution of crjiob without 
controlling for radius (the size of the radius error bars 
are fairly constant across the dataset). 

The second aspect of hierarchical mo del-checking 
which requires some effort is the identification of a ro¬ 
bust statistic to quantify the discrepancy between the 
observed data and the model-generated data. Optimally 
this statistic would test the fit of every part of the mod¬ 
eled probability distribution, including the “average” be¬ 
havior of the model, the “extreme” behavior out on the 
tails of the distribution, and for hierarchical models, the 
accuracy of the grouping and relational structure that 
is illustrated by graphs like Figure [TJ Due to high di¬ 
mensional parameter space, this proves to be very dif¬ 
ficult, so one must identify several statistics which test 
such aspects separately. To illustrate the problem, we 
choose two: fia, the fraction of a given dataset’s simu¬ 
lated mass, radius observations which fall within the 68% 
coverage interval of our probabilistic, baseline M-R rela¬ 
tion (blue region in Figure |4]), and the fraction of a 
given dataset’s mass measurements whose Icr error bars 
cross the mean relation p (see EqnE]) of that same model. 
Therefore, fia tests how well the width of our probabilis¬ 
tic M-R relation fits the data, and tests how tightly 
grouped the data are around the mean compared to the 
normal distribution of our model. 

The results of these tests are displayed in FigureEI The 
observed data’s fia statistic (top) falls at the 63rd and 
83rd percentiles (i.e. within “one sigma”) of the distri¬ 
butions for datasets generated from the individual pos¬ 
teriors and the population-wide posteriors, respectively. 
Therefore, the model is fully consistent with the data for 
the aspect of the model that this statistic tests: the in¬ 
ferred intrinsic scatter of the M-R relation. On the other 
hand, the observed data’s statistic falls at the 2nd and 
0.2th percentile of the distributions. This test is sensi¬ 
tive to the shape of the distribution around the mean 
relation, and thereby probes the appropriateness of our 
assumption of a normal distribution in Eqn El The fact 
that the data are marginally inconsistent with this aspect 
of the model reveals that this is an area of improvement 
for future work (see il6.3l for a further discussion about 
this). 

The differences in the model-data fit implied by these 
two statistics illustrates how careful one must be in per¬ 
forming model checking and interpreting the results. In¬ 
stead of asking “is the model consistent with the data?”, 
a more well-posed question would be “in what ways are 
the model consistent with the data?”, especially as these 
statistical models become more complex to accommodate 
more sources of uncertainty and more realistic physics. 
For the model at hand, we address this question by re¬ 
turning to our purpose. We wanted to explore the need 
for intrinsic scatter in the M-R relation; therefore, we 
care most about the quality of the fit with respect to the 
spread of the M-R relation, i.e. the data-model consis¬ 
tency as quantified by fia- Because this fit is good, we 
are satisfied that our main result holds up to this further 
scrutiny. The data-model discrepancy has implica¬ 
tions for our choice to use a normal distribution for the 
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Fraction of sample in 1 ct region (fiJ Fraction of sample in 1 ct region (fiJ 




Fraction of dataset spanning p (f^) Fraction of dataset spanning p (f^) 

Figure 5. Checking the model against the observed data, us¬ 
ing t wo d ifferent statistics to quantify the data-model consistency 
(see lot . All panels contain 5,000 hypothetical datasets generated 
from the baseline posterior predictive distribution fEons fBl^ . with 
the statistic for the observed dataset (Table as the vertical black 
dashed line. The green histograms on the left show the consistency 
of the model with the current dataset, while the blue histograms 
on the right show the consistency with new hypothetical datasets. 
The top histograms show the distribution of the statistic /i^. The 
observed fi^- falls at the 63rd and 83rd percentiles of the distribu¬ 
tion (left and right, respectively), indicating that the width of the 
model’s M-R relation is consistent with the data. The bottom his¬ 
tograms show the distribution of the statistic . The observed 
falls at the 2nd and 0.2th percentiles of the distribution, indicating 
that there is room for future improvement on the choice of what 
distribution to use for the probabilistic M-R relation. 

probabilistic M-R relation, which we discuss in >16.31 

To wrap up our discussion on hierarchical Bayesian 
model checking, we note that the reader may find it 
surprising that the distribution of fi^ statistics peaks 
around ^ 0.4, given that the statistic was defined by the 
“one-sigma” central region of the M-R relation. The rea¬ 
son why the statistic does not instead cluster around 0.68 
is due to a well-known feature of hier archic a l mod eling 
called shrinkage. First worked out by ISteinI (119551) an d 
developed within a Bayesian framework bv iGoodI (|196^ . 
shrinkage refers to the tendency for individual parame¬ 
ter values at intermediate levels of hierarchical models 
(such as ) to cluster more closely together, or 

“shrink” toward their mean, than if the parameters had 
been analyzed completely independently of each other 
(we see this shrinking visually by comparing the gray and 
red lines in the right panel of Figure |1|). This occurs be¬ 
cause, by design, the hierarchical structure of the model 
causes these individual parameters to be related to each 
other, which then enables information about one parame¬ 
ter to influence our inference for another parameter. This 
additional information, provided solely by the hierarchi¬ 
cal structure, causes the variance among the population 
to decrease relative to the case where the structure, and 
thus the information, was not available. This decrease in 
variance is perceived as a “shrinking” towards the mean. 

We can understand how shrinkage manifests for the M- 


R relation by returning to the forward model defined in 
Eqn[4l First we note that it is the true masses and radii, 
not the observed masses and radii, which are drawn from 
the M-R relation. Therefore, it is the and val¬ 
ues which would produce fia- = 0.68. However, we don’t 
(0 (i) 

actually observe and Rl outright; we observe them 
convolved with some error . Because the convolution 
of a normal distribution with another normal produces a 
wider normal distribution, we would indeed expect more 
of the observed mass and radius points to fall outside of 
the “one-sigma” bounds defined by the top level of the 
model. With this additional insight into the nature of 
hierarchical models, we understand that this behavior is 
not only consistent with what we see, but expected. 

6.3. Caveats and Future Work 

As discussed in mi we made a number of assumptions 
and modeling choices to facilitate a straightforward in¬ 
vestigation into the need for an intrinsic scatter term in 
the sub-Neptune M-R relation. Some of these choices, 
such as parameterizing the relation with a power law or 
using a normal distribution to characterize the scatter, 
were driven by convenience and familiarity rather than 
by physics. In particular, we chose to use a power law 
in order facilitate direct comparisons between our results 
and those in the literature, and we chose the normal dis¬ 
tribution because it directly parameterizes the intrinsic 
scatter in the population rather than relying on a trans¬ 
formation from a more obscure distribution to derive the 
population variance. In general, our philosophy was to 
limit the number of free parameters in our model as much 
as possible in order to maximize the information content 
per parameter. 

That said, these parameterizations are by no means 
the only ones that could be reasonably used. These ap¬ 
proximations can and should be revisited in future work, 
especially as more data becomes available and we begin 
to describe more subtle features in the M-R relation. An 
important aspect of these studies will be mo del-checking 
to inform choices for parameterizing the M-R relation 
(see >16.21) and performing quantitative model compar¬ 
isons. For now, we emphasize that the inclusion of any 
kind of probability distribution to account for intrinsic 
scatter represents a significant improvement over prior 
work, and we leave testing different distributions for fu¬ 
ture studies. 

A related parameterization issue is our choice not to in¬ 
clude any other planet properties into this M-R relation, 
such as a dependence on orbital period. It is entirely 
plausible on both theoretical and observational grounds 
that the M-R relation at short periods may be different 
from that at long periods. Theoretically, photoevapo¬ 
ration is likely to have eroded planets on short orbits 
(see, for example. iMurrav-Clav et al.ll200l : iLonez et al.l 
1201^ iHowe fc: Burrowjl2015ll . thereby causing the pop¬ 
ulation of highly irradiated planets to be denser on aver¬ 
age; alternatively, migration could have produced a mass- 
dependent stopping location give n a certain structure for 
the in ner regions of the disk (e.g. iBenftez-Llambav et al.l 
[Mill or tidal circularization could have pro duced a 
densi ty-period correlation for the shortest orbits (iBarnesI 
1201411 . Observationally, we see a suggestive dearth of > 3 
R 0 Kepler planet candidates and lower-mass RV planets 
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at short orbital periods (|Beauge &: Nesvor'n?ll2013ll and 
a period dependence in the marginalized, bias-corrected 
Kepl e r radius distribution (lYoudiiil 1201 ll : iHoward et alJ 
I 2 OI 2 I : [Morton fc SwiftJl20T^ . These observations hint at 
the potential for interesting features in the joint density- 
period distribution. 

While these results provide strong motivation for fur¬ 
ther investigations into period-dependent M-R relations, 
there will be numerous details to address in fitting the 
more complex period-dependent statistical model to the 
data. We choose to leave these analyses for future work 
given the number of tests that we have already performed 
here (choice of parameterization for the intrinsic scatter, 
RV vs. TTV masses, and different radius ranges). In par¬ 
ticular, this future work will need to robustly account for 
the systematic biases between the different mass mea¬ 
surement methods to make sure that our inferences are 
not driven by arbitrary choices for the data we use (or if 
these choices prove to be unavoidable, to quantify their 
effect). In general, future studies will need to perform 
model comparison to identify the most useful parameter- 
izations given the data. These two efforts pose cutting- 
edge statistical challenges that are worthy of separate in¬ 
vestigations in and of themselves, and so we reserve them 
for follow-up studies that build on the results presented 
here. 

Returning to our current statistical model for the M-R 
relation, our third major modeling choice after the power 
law and Gaussian astrophysical scatter was using Gaus¬ 
sian errors for the observed mass and radius measure¬ 
ments, as denoted in the last line of our statistical model 
(Eqn0]). The most accurate hierarchical analyses include 
the actual likelihood used to infer these planetary param¬ 
eters from the lower-level photometric and spectroscopic 
data, rather than assume such a heavily simplified func¬ 
tional form as we did. Incorporating observers’ true like¬ 
lihoods is important to capture important correlations 
that exist in the measurement analysis and to use all of 
the information contained in the lower-level data. Ide¬ 
ally, future planet discovery papers will make their full, 
joint likelihood distributions available in addition to the 
reported “best-fit” value; depending on the type of anal¬ 
ysis that observers use to make measurements from their 
data, this means providing either log-likelihood (e.g. x^) 
contours over all of the parameters that were considered 
or providing posterior samples along with detailed infor¬ 
mation about the choice of priors. Unfortunately, this 
information is not publicly available, and so we cannot 
use it. Therefore, we follow the convention established 
by WM14 in using a normal distribution to represent the 
marginalized mass or radius likelihood. 

Another notable approximation that we have made 
arises from the difference between the graphical model 
describing our M-R relation (Figure [T]) and the way in 
which we have implemented it (Eqn|3]). The difference 

between the two is subtle: by conditioning on in 

the lowest level of the model, we are assuming that the 
observed mass measurement uncertainty is independent 
from the measurement itself. In reality, the calculation 
of both the uncertainty and the measured value are pro¬ 
duced by the same modeling process and thus are cor¬ 


related in a potentially nontrivial wa~vFl. In practice, 
we had no other choice than to assume independence 
because the nature of this correlation is rarely, if ever, 
published. 

An important caveat about our results is that we ig¬ 
nore selection effects, primarily due to the inability to 
model the human decisions which affected the ground- 
based follow-up observing campaigns. As discussed in 
(J3l we would ideally have a uniform sample of masses 
and radii which were analyzed in the same way, and have 
well-characterized the selection effects and detection effi¬ 
ciencies. Unfortunately, this is simply not possible at the 
present moment. The sample is highly heterogeneous, 
a compilation of many observing teams’ programmatic, 
yearly, and nightly priorities that are not communicated 
in the literature and therefore cannot be accurately mod¬ 
eled. There is significant, difficult work still to be done 
to understand the extent to which the follow-up process 
shapes the observed mass-radius space and therefore in¬ 
fluence our inferences for the underlying M-R relation. 

This is particularly important for interpreting the ap¬ 
parent discrepancy between the population of TTV- 
measured masses and the population of RV-measured 
masses. In 115.21 we corroborate the systematic den¬ 
sity difference betwee n TTV and RV p l anets that 
had been noticed by iJontof-Hutter et ^ (|2014fl and 
iWeiss &: Marcvl (|2014f) . as we find a ^ two “sigma” off¬ 
set in the two datasets’ M-R relation parameters. Since 
we do not attempt to model the selection effects that 
are involved, we cannot distinguish how much of this 
offset is created by observational bias and how much is 
due to an intrinsic difference in the density distribution. 
If the latter turns out to be the driving factor behind 
this apparent discrepancy, disentangling the underlying 
reason for this astrophysical density difference from the 
current list of features which distinguish the two samples 
will have numerous implications for planet formation and 
evolution. 

7. CONCLUSIONS 

In this paper we have defined and constrained a proba¬ 
bilistic mass-radius relationship for sub-Neptune planets 
(Eqn [2] with parameter values in Table [T] and the density 
constraint provided in Eqn [5]). In particular, we demon¬ 
strate that there is intrinsic, astrophysical scatter in this 
relation, and that, except for the smallest planets, this 
scatter is nonzero for all considered datasets. For the 
first time in the exoplanet literature, we display the un¬ 
certainties in the M-R relation parameters through pos¬ 
terior distributions and explain how to properly incor¬ 
porate these uncertainties into a predictive distribution 
of masses for individual planets. This M-R relation will 
be useful for anyone who wishes to perform large-scale 
dynamical or planet formation studies with the Kepler 
planet candidates. 

More broadly, this work provides a framework for fur- 

At first glance this may seem like an inconsequential difference 
considering the other modeling assumptions we have made, but 
the assumption of independence becomes problematic when the 
processes of detection and measurement use the same data and 
when the population analyses use regions of parameter space where 
the de tection efficiency starts to drop (see ILoredo fc Wassermanl 
II199RI 1 for the technical details). We do not correct for detection 
bias in this paper, so are unaffected by this problem. 
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ther analyses of the M-R relation and its probable de¬ 
pendencies on period and stellar properties. Here we 
have demonstrated how to develop and apply a statisti¬ 
cal model that incorporates measurement uncertainties 
into population-wide inference of the sub-Neptune M- 
R relation and that directly produces estimates of the 
uncertainty of the inferred parameters. This method is 
advantageous because it is quantitative and easily gener- 
alizable to include additional variables which may be im¬ 
portant in the underlying M-R relation that we are trying 
to model, such as incident flux l il6.3p or different stellar 
masses or metallicities. We do not investigate these pos¬ 
sibilities here, as we wish to begin this broader effort with 
the simplest reasonable statistical model. Nevertheless, 
searching for additional dependencies in the M-R rela¬ 
tion is an important endeavor in order to understand 
which physical processes shape the super-Earth popula¬ 
tion. With this work we establish both a point of com¬ 
parison and a framework for these further studies. 
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ily reflect the views of the National Science Eoundation. 
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Table 2 

Masses and Radii of Small Planets 


Planet Name 

Period 

(days) 

A4o6s 

(M®) 

^Mobs 

(M®) 

^obs 

(Re) 

^ Robs 

(R©) 

First 

Reference 

Mass, Radius 
Reference 

Note 

55 Cnc e 

0.737 

8.09 

0.26 

2.17 

0.098 

McArthur(2004) 

Nelson(2014), GillorL(2012) 

0 

CoRoT-7 b 

0.854 

4.73 

0.95 

1.58 

0.064 

Queloz(2009); Leger(2009) 

Barros(2014) 

0 

GJ 1214 b 

1.580 

6.45 

0.91 

2.65 

0.09 

Charbonneau(2009) 

Carter(2011) 

0,1 

GJ 3470 b 

3.337 

13.73 

1.61 

3.88 

0.32 

Bonms(2012) 

Biddle(2014) 

0 

HD 97658 b 

9.491 

7.87 

0.73 

2.34 

0.16 

Howard(2011) 

Dragomir(2013) 

0,1 

HIP 116454 b 

9.12 

11.82 

1.33 

2.53 

0.18 

Vanderburg(2015) 

Vanderburg(2015) 

0 

Kepler-lU b 

0.837 

3.33 

0.49 

1.47 

0.02 

Batalha(2011) 

Dumusque(2014) 

0 

Kepler-10 c 

45.294 

17.2 

1.9 

2.35 

0.06 

Batalha(2011) 

Dumusque(2014) 

0 

Kepler-19 b 

9.287 

— 

20.3 

2.21 

0.048 

Borucki(2011) 

Ballard(2011) 

0,4 

Kepler-20 b 

3.696 

8.7 

2.2 

1.91 

0.16 

Borucki(2011) 

Gautier(2012) 

0 

Kepler-20 c 

10.854 

16.1 

3.5 

3.07 

0.25 

Borucki(2011) 

Gautier(2012) 

0 

Kepler-20 d 

77.612 

— 

20.1 

2.75 

0.23 

Borucki(2011) 

Gautier(2012) 

0,4 

Kepler-20 e 

6.098 

— 

3.08 

0.868 

0.08 

Borucki(2011) 

Fressin(2012) 

0,4 

Kepler-20 f 

19.58 

— 

14.3 

1.03 

0.11 

Borucki(2011) 

Fressin(2012) 

0,4 

Kepler-21 b 

2.786 

— 

10.4 

1.635 

0.04 

Borucki(2011) 

Howell(2012) 

0,4 

Kepler-25 b 

6.239 

9.60 

4.20 

2.71 

0.05 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-37 b 

13.367 

2.78 

3.70 

0.32 

0.02 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-37 c 

21.302 

3.35 

4.00 

0.75 

0.03 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-37 d 

39.792 

1.87 

9.08 

1.94 

0.06 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-48 b 

4.778 

3.94 

2.10 

1.88 

0.10 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-48 c 

9.674 

14.61 

2.30 

2.71 

0.14 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-48 d 

42.896 

7.93 

4.60 

2.04 

0.11 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-62 b 

5.715 

— 

9 

1.31 

0.04 

Borucki(2011) 

Borucki(2013) 

0,4 

Kepler-62 c 

12.44 

— 

4 

0.54 

0.03 

Borucki(2013) 

Borucki(2013) 

0,4 

Kepler-62 d 

18.164 

— 

14 

1.95 

0.07 

Borucki(2011) 

Borucki(2013) 

0,4 

Kepler-62 e 

122.39 

— 

36 

1.61 

0.05 

Borucki(2011) 

Borucki(2013) 

0,4 

Kepler-62 f 

267.29 

— 

35 

1.41 

0.07 

Borucki(2013) 

Borucki(2013) 

0,6 

Kepler-68 b 

5.399 

5.97 

1.70 

2.33 

0.02 

Borucki(2011) 

Marcy(2014) 

0,3 

Kepler-68 c 

9.605 

2.18 

3.50 

1.00 

0.02 

Batalha(2013) 

Marcy(2014) 

0,3 

Kepler-78 b 

0.354 

1.69 

0.41 

1.20 

0.09 

Sanchis-Ojeda(2013a) 

Howard(2013) 

0,1 

Kepler-89 b 

3.743 

10.50 

4.60 

1.71 

0.16 

Borucki(2011) 

Weiss(2013) 

0,1 

Kepler-93 b 

4.727 

4.02 

0.68 

1.48 

0.019 

Borucki(2011) 

Dressing(2015) 

0 

Kepler-94 b 

2.508 

10.84 

1.40 

3.51 

0.15 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-95 b 

11.523 

13.00 

2.90 

3.42 

0.09 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-96 b 

16.238 

8.46 

3.40 

2.67 

0.22 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-97 b 

2.587 

3.51 

1.90 

1.48 

0.13 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-98 b 

1.542 

3.55 

1.60 

1.99 

0.22 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-99 b 

4.604 

6.15 

1.30 

1.48 

0.08 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-100 b 

6.887 

7.34 

3.20 

1.32 

0.04 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-100 c 

12.816 

0.85 

4.00 

2.20 

0.05 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-100 d 

35.333 

-4.36 

4.10 

1.61 

0.05 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-101 c 

6.03 

— 

9 

1.25 

0.18 

Borucki(2011) 

Bonomo(2014) 

0,5 

Kepler-102 d 

10.312 

3.80 

1.80 

1.18 

0.04 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-102 e 

16.146 

8.93 

2.00 

2.22 

0.07 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-102 f 

27.454 

0.62 

3.30 

0.88 

0.03 

Borucki(2011) 

Marcy(2014) 

0,1 
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Table 2 — Continued 


Planet Name 

Period 

(days) 

A4o6s 

(Mffi) 

^Mobs 

(M®) 

Robs 

(Re) 

^ Robs 

(R©) 

First 

Reference 

Mass, Radius 
Reference 

Note 

Kepler-102 b 

5.287 

0.41 

1.60 

0.47 

0.02 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-102 c 

7.071 

-1.58 

2.00 

0.58 

0.02 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-103 b 

15.965 

14.11 

4.70 

3.37 

0.09 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-106 b 

6.165 

0.15 

2.80 

0.82 

0.11 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-106 c 

13.571 

10.44 

3.20 

2.50 

0.32 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-106 d 

23.980 

-6.39 

7.00 

0.95 

0.13 

Batalha(2013) 

Marcy(2014) 

0,1 

Kepler-106 e 

43.844 

11.17 

5.80 

2.56 

0.33 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-109 b 

6.482 

1.30 

5.40 

2.37 

0.07 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-109 c 

21.223 

2.22 

7.80 

2.52 

0.07 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-113 b 

4.754 

7.10 

3.30 

1.82 

0.05 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-113 c 

8.925 

-4.60 

6.20 

2.19 

0.06 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-131 b 

16.092 

16.13 

3.50 

2.41 

0.20 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-131 c 

25.517 

8.25 

5.90 

0.84 

0.07 

Batalha(2013) 

Marcy(2014) 

0,1 

Kepler-406 b 

2.426 

4.71 

1.70 

1.43 

0.03 

Borucki(2011) 

Weiss(2014) 

0,1 

Kepler-406 c 

4.623 

1.53 

2.30 

0.85 

0.03 

Batalha(2013) 

Weiss(2014) 

0,1 

Kepler-407 b 

0.669 

0.06 

1.20 

1.07 

0.02 

Borucki(2011) 

Marcy(2014) 

0,1 

Kepler-409 b 

68.958 

2.69 

6.20 

1.19 

0.03 

Batalha(2013) 

Marcy(2014) 

0,1 

Kepler-4 b 

3.213 

24.47 

3.81 

4.00 

0.21 

Borucki(2010) 

Borucki(2010) 


GJ 436 b 

2.64 

25.4 

2.1 

4.10 

0.16 

Butler(2004) 

Lanotte(2014) 


Kepler-89 c 

10.42 

15.6 

10.6 

4.32 

0.41 

Batalha(2013) 

Weiss(2013) 


HAT-P-11 b 

4.888 

25.74 

2.86 

4.73 

0.157 

Bakos(2010) 

Bakos(2010) 


CoRoT-22 b 

9.756 

— 

35 

4.88 

0.28 

Moutou(2014) 

Moutou(2014) 

4 

Kepler-103 c 

179.61 

36.1 

25.2 

5.14 

0.14 

Borucki(2011) 

Marcy(2014) 


Kepler-101 b 

3.488 

51.1 

4.9 

5.77 

0.82 

Borucki(2011) 

BorLomo(2014) 


Kepler-63 b 

9.43 

— 

95 

6.1 

0.2 

Borucki(2011) 

Sanchis-Ojeda(2013b) 

6 

HAT-P-26 b 

4.235 

18.75 

2.23 

6.33 

0.58 

Hartman(2011) 

Hartman(2011) 


(JoRo'l‘-8 b 

6.212 

69.92 

9.53 

6.39 

0.22 

Borde(2010) 

Borde(2010) 


Kepler-89 e 

54.32 

35 

23 

6.56 

0.62 

Batalha(2013) 

Weiss(2013) 


Kepler-11 b 

10.304 

1.90 

1.2 

1.80 

0.04 

Lissauer(2011) 

Lissauer(2013) 

1,2 

Kepler-11 c 

13.024 

2.90 

2.3 

2.87 

0.06 

Lissauer(2011) 

Lissauer(2013) 

1,2 

Kepler-11 d 

22.684 

7.30 

1.2 

3.12 

0.07 

Lissauer(2011) 

Lissauer(2013) 

1,2 

Kepler-11 f 

46.689 

2.00 

0.9 

2.49 

0.06 

Lissauer(2011) 

Lissauer(2013) 

1,2 

Kepler-11 g 

118.38 

— 

25 

3.33 

0.07 

Lissauer(2011) 

Lissauer(2013) 

2,4 

Kepler-18 b 

3.505 

6.9 

3.4 

2.00 

0.100 

Borucki(2011) 

Cochran(2011) 

1,2 

Kepler-30 b 

29.334 

11.3 

1.4 

3.90 

0.200 

Borucki(2011) 

Sanchis-Ojeda(2012) 

1,2 

Kepler-36 b 

13.840 

4.45 

0.30 

1.486 

0.035 

Carter(2012) 

Carter(2012) 

1,2 

Kepler-36 c 

16.239 

8.08 

0.53 

3.679 

0.054 

Borucki(2011) 

Carter(2012) 

1,2 

Kepler-79 b 

13.485 

10.9 

6.7 

3.47 

0.07 

Borucki(2011) 

Jontof-Hutter(2014) 

1,2 

Kepler-79 c 

27.403 

5.9 

2.1 

3.72 

0.08 

Borucki(2011) 

Jontof-Hutter(2014) 

1,2 

Kepler-79 e 

81.066 

4.1 

1.2 

3.49 

0.14 

Batalha(2013) 

Jontof-Hutter(2014) 

1,2 

Kepler-88 b 

10.954 

8.7 

2.5 

3.78 

0.38 

Borucki(2011) 

Nesvorny(2013) 

2 

Kepler-138 c 

13.782 

3.83 

1.39 

1.610 

0.160 

Borucki(2011) 

Kipping(2014) 

2 

Kepler-138 d 

23.089 

1.01 

0.38 

1.610 

0.160 

Borucki(2011) 

Kipping(2014) 

2 

Kepler-289 b 

34.545 

7.3 

6.8 

2.15 

0.1 

Borucki(2011) 

Schmitt(2014) 

2 

Kepler-289 d 

66.063 

4.0 

0.9 

2.68 

0.17 

Borucki(2011) 

Schmitt(2014) 

2 


Note. — 


0. Included in baseline dataset, which consists of RV masses (see 

1. Mass, radius values and their error bars are unchanged (within rounding error) from WM14. 

2. Mass measured by fitting the observed TTVs to N-body integrations of the system. 

3. The Kepler-68 planets were repeated twice in the WM14 dataset, so we use the lMarcv et all 1)20141) values. 

4. The cFMobs column contains the 2cr upper limit as reported in the second reference. 

5. Only a la upper limit of 3.78 was given, and no posteriors were shown; in this analysis, we set the 2cr upper 
limit at 9 Mq to include 1.8 m/s uncertainty quoted in RV semi-amplitude for the larger Kepler-101 b. 

6 . The 2a upper limit is interpolated from given la and 3cr upper limits. 










